Sex and statin-related genetic associations at the PCSK9 gene locus: results of genome-wide association meta-analysis

Background
Proprotein convertase subtilisin/kexin type 9 (PCSK9) is a key player in lipid metabolism, with higher plasma levels in women throughout their life. Statin treatment affects PCSK9 levels, also with evidence of sex-differential effects. It remains unclear whether these differences can be explained by genetics.

Methods
We performed genome-wide association meta-analyses (GWAS) of PCSK9 levels stratified for sex and statin treatment in six independent studies of Europeans (8936 women and 11,080 men; 14,825 statin-free and 5191 statin-treated individuals). Loci associated in one of the strata were tested for statin- and sex-interactions considering all independent signals per locus. Independent variants at the PCSK9 gene locus were then used in a stratified Mendelian Randomization analysis (cis-MR) of PCSK9 effects on low-density lipoprotein cholesterol (LDL-C) levels to detect differences in causal effects between the subgroups.

Results
We identified 11 loci associated with PCSK9 in at least one stratified subgroup (p < 1.0 × 10⁻⁶), including the PCSK9 gene locus and five other lipid loci: APOB, TM6SF2, FADS1/FADS2, JMJD1C, and HP/HPR. The interaction analysis revealed eight loci with sex- and/or statin-interactions. At the PCSK9 gene locus, there were four independent signals, one with a significant sex-interaction showing stronger effects in men (rs693668). Regarding statin treatment, there were two significant interactions in PCSK9 missense mutations: rs11591147 had stronger effects in statin-free individuals, and rs11583680 had stronger effects in statin-treated individuals. Besides replicating known loci, we detected two novel genome-wide significant associations: one for statin-treated individuals at 6q11.1 (within KHDRBS2) and one for males at 12q24.22 (near KSR2/NOS1), both with significant interactions. In the MR of PCSK9 on LDL-C, we observed significant causal estimates within all subgroups, but significantly stronger causal effects in statin-free subjects compared to statin-treated individuals.

Conclusions
We performed the first double-stratified GWAS of PCSK9 levels and identified multiple biologically plausible loci with genetic interaction effects. Our results indicate that the observed sexual dimorphism of PCSK9 and its statin-related interactions have a genetic basis. Significant differences in the causal relationship between PCSK9 and LDL-C suggest sex-specific dosages of PCSK9 inhibitors.

Supplementary Information
The online version contains supplementary material available at 10.1186/s13293-024-00602-6.


LURIC
The Ludwigshafen Risk and Cardiovascular Health (LURIC) study is a monocentric, hospital-based prospective study including 3316 individuals referred for coronary angiography, recruited at the Ludwigshafen Cardiac Center in southwestern Germany between 1997 and 2000 (3). Clinical indications for angiography were chest pain or a positive non-invasive stress test suggestive of myocardial ischemia. To limit clinical heterogeneity, individuals suffering from acute illnesses other than acute coronary syndrome, chronic non-cardiac diseases, or a history of malignancy within the past five years were excluded. All participants completed a detailed questionnaire which gathered information on medical history, clinical, and lifestyle factors. Fasting blood samples were obtained by venipuncture in the early morning and stored for later analyses. Information on vital status during follow-up was obtained from local registries. Death certificates, medical records of local hospitals, and autopsy data were reviewed independently by two experienced clinicians who were blinded to patient characteristics and who classified the causes of death. Study protocols were approved by the ethics committee of the "Landesärztekammer Rheinland-Pfalz" and the study was conducted in accordance with the "Declaration of Helsinki". Informed written consent was obtained from all participants. Samples were genotyped on the Affymetrix 6.0 array and the Illumina 200K Metabochip array. Both datasets were merged before imputation. Variants with a call rate less than 0.98, Hardy-Weinberg equilibrium P < 5 × 10⁻⁴, or MAF < 0.01 were removed. Imputation was performed using the 1000 Genomes Project phase 3 reference panel with Minimac.
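
The variant-level filters described for LURIC can be illustrated with a minimal sketch. The thresholds are taken from the text; the `Variant` record, its field names, and the function name are hypothetical and do not reflect the software actually used by the study.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical per-variant QC summary (field names are assumptions)."""
    rsid: str
    call_rate: float   # fraction of samples with a genotype call
    hwe_p: float       # Hardy-Weinberg equilibrium test p-value
    maf: float         # minor allele frequency


def passes_variant_qc(v, min_call_rate=0.98, min_hwe_p=5e-4, min_maf=0.01):
    """Keep a variant only if it fails none of the three removal criteria."""
    removed = (
        v.call_rate < min_call_rate
        or v.hwe_p < min_hwe_p
        or v.maf < min_maf
    )
    return not removed


# Made-up example values, for illustration only
variants = [
    Variant("rsA", call_rate=0.995, hwe_p=0.40, maf=0.12),
    Variant("rsB", call_rate=0.970, hwe_p=0.40, maf=0.12),  # low call rate
    Variant("rsC", call_rate=0.995, hwe_p=1e-6, maf=0.12),  # HWE deviation
    Variant("rsD", call_rate=0.995, hwe_p=0.40, maf=0.005),  # rare variant
]
kept = [v.rsid for v in variants if passes_variant_qc(v)]
```

The other cohorts apply the same kind of filter with their own thresholds, which could be passed through the keyword arguments.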

TwinGene
The TwinGene project, conducted between 2004 and 2008, is a population-based Swedish study of twins born between 1911 and 1958 (4). The study participants had previously participated in a telephone interview called Screening Across the Lifespan Twin Study, conducted between 1998 and 2002. To be included in TwinGene, both twins had to be alive. The zygosity of the twins was based on self-reported childhood resemblance, or on DNA markers (for 18% of the total sample). In total, 12,591 individuals participated by donating blood to the study and by answering questionnaires about lifestyle and health. The study was approved by the local ethics committee at Karolinska Institutet and all participants gave informed consent. DNA from 9,896 individual subjects was sent to the SNP&SEQ Technology Platform, Uppsala, Sweden, for genome-wide genotyping with the Illumina OmniExpress bead chip (all available dizygotic twins plus one twin from each available monozygotic twin pair). Genotyping results for 9,836 subjects and 731,442 SNPs passed the initial lab-based quality control. SNPs were filtered out when more than 3% of genotypes were missing, when the minor allele frequency was less than 1%, or when they deviated from Hardy-Weinberg equilibrium (p ≤ 1 × 10⁻⁷). Subjects were filtered out when more than 3% of genotypes were missing, or when they showed cryptic relatedness or a deviation in heterozygosity of more than five SD from the population mean. The heterozygosity of chromosome X was used to check the sex. After QC, 9,617 individuals and 644,556 SNPs remained. Before phasing, we filtered duplicates, removed the pseudoautosomal region, and aligned the dataset to the 1000 Genomes Project phase 3 reference panel using shapeit2. Imputation was then performed with Eagle2. To adjust for the relatedness structure in TwinGene, the genetic relationship matrices (GRMs) per chromosome were estimated and combined via PLINK 2.0. The genetic association with PCSK9 was then estimated with a mixed linear model leaving one chromosome out (MLM-LOCO) implemented in GCTA (5,6). Here, the analyzed chromosome is excluded from the GRM.

GCKD
The GCKD (German Chronic Kidney Disease) study is an ongoing prospective cohort study of 5,217 individuals suffering from moderate chronic kidney disease, enrolled between 2010 and 2012. Details on the study design and patient characteristics have been published earlier (7). Patients were eligible if they were 18-74 years old, of Caucasian ancestry, under regular care by nephrologists, and presented with either of the following conditions: an estimated glomerular filtration rate (eGFR) of 30-60 ml/min/1.73 m² (Kidney Disease Improving Global Outcomes [KDIGO] stage G3, A1-A3), or an eGFR >60 ml/min/1.73 m² in the presence of overt proteinuria, defined by a urine albumin-creatinine ratio (UACR) >300 mg/g or equivalent (KDIGO stage G1-G2, A3). Every participant provided written informed consent. The study was approved by the ethics committees of each participating study center. Askimed, a cloud-based web platform, was used for collection and management of data (https://www.askimed.com).
Genotype data were obtained from the GCKD study using the Illumina HumanOmni2.5-8 v1.2 BeadChip (Illumina, GenomeStudio, Genotyping Module Version 1.9.4). Samples were excluded if the call rate per sample was <0.97, if there was a sex mismatch, or if they failed mean heterozygosity, genetic ancestry, or cryptic relatedness checks. SNPs were excluded prior to imputation if the call rate was <0.96, if positions were duplicated, or if they deviated from Hardy-Weinberg equilibrium (p < 10⁻⁵).

KORA-F3
The KORA F3 study (Cooperative Health Research in the Region of Augsburg) is a follow-up study of KORA S3, which is a population-based adult cohort study and is considered a random, representative sample of southern Germany (Augsburg) (10). The inclusion criteria were age 25-74, German nationality, and residence in Augsburg or surrounding counties. The KORA F3 study was conducted in 2004/2005 and included 3,184 participants (11).
Written informed consent was obtained from every participant and the Bavarian Medical Association ethics committee approved the study.
In the KORA F3 study, samples were genotyped with the Illumina Omni 2.5 and Illumina Omni Express arrays. Samples were excluded if they had a call rate <97%. SNPs were excluded if they deviated from Hardy-Weinberg equilibrium (p < 10⁻⁶), had a call rate <0.98 or a minor allele frequency <0.01, or were only available on one chip.
The Michigan Imputation Server (https://imputationserver.sph.umich.edu/index.html#) (8) was used for imputing the genotypes using the Haplotype Reference Consortium panel (HRC r1.1 2016) (9).

Figure S3: Regional association plots at the PCSK9 gene locus.
For each subgroup, a regional association (RA) plot is given. In all plots, the lead SNP rs11591147 is plotted in blue. SNPs in LD with this variant are plotted in yellow (LD r² ranging between 0.1 and 0.5). Independent variants as identified by GCTA COJO select are encircled in red.

Figure S4: LD-Matrix plot generated by LDlink.
We included all seven SNPs that were selected as independent signals in one of the subgroups to test their pairwise LD using the European reference set. The lower triangle in red indicates LD r², while the upper triangle in blue indicates D'. There are four LD clusters visible, and the best-associated SNPs per cluster are rs2495491, rs11591147, rs11583680, and rs693668. LD between the clusters is low (r² < 0.05), and LD within the clusters is high (r² > 0.7).

Figure S5: Forest plots of the four independent SNPs over the eight subgroups.
Each SNP and subgroup are plotted using the GWAS (unconditional) beta estimates and 95% confidence intervals (CI). Subgroups are sorted by increasing beta estimates per SNP (different sorting per SNP). A) rs11591147 (lead SNP) with significant statin-interaction B) rs693668 with sex-interaction C) rs11583680 with statin-interaction (males treated vs males free) D) rs2495491 without interaction.

Figure S7: Forest plots for the novel loci over the eight subgroups.
Each SNP and subgroup are plotted using the GWAS beta estimates and 95% confidence intervals (CI). Subgroups are sorted by increasing beta estimates per SNP (different sorting per SNP). A) rs4767549 (NOS1/KSR2) with significant sex-interaction B) rs3076276 (KHDRBS2) with significant statin-interaction C) rs76849715 (ALOX5) with significant sex- and statin-interaction D) rs4763806 (SLCO1B1) with significant sex-interaction E) rs34924001 (PRKAG2) with significant sex- and statin-interaction.

Figure S1: Flowchart of the stratified genome-wide analyses.
We included the data of six studies of European descent. All participating studies provided GWAS summary statistics stratified for both statin treatment and sex. In the first round of meta-analyses, we combined the study-wise data for the double-stratified subgroups. In the second round of meta-analyses, we combined pairwise strata to estimate the single-strata SNP effects, for example statin-free and statin-treated females combined to estimate SNP effects in females. Associated loci (p < 1 × 10⁻⁶) were then tested for sex- and statin-interactions. All loci were annotated with candidate genes and tested for colocalization with gene expression and lipid data. For the PCSK9 gene locus, we performed fine-mapping using GCTA COJO. Finally, we compared the causal effects of PCSK9 on LDL-C using the subgroup-specific effect estimates of four SNPs at the PCSK9 gene locus.
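
The caption does not state how the interaction tests were computed. As a minimal sketch under stated assumptions, one common approach compares the effect estimates of two strata (for example males vs. females) with a z-test on their difference. The function name, the optional correlation parameter `r`, and the numbers in the example are hypothetical.

```python
import math


def interaction_test(beta1, se1, beta2, se2, r=0.0):
    """Difference test between two stratum-specific SNP effects.

    Assumption: the two strata are (nearly) independent, so the
    correlation r between the estimates defaults to 0.
    """
    diff = beta1 - beta2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return diff, se_diff, z, p


# Made-up effect estimates for one SNP in two strata
diff, se_diff, z, p = interaction_test(beta1=0.30, se1=0.03,
                                       beta2=0.00, se2=0.04)
```

With non-overlapping strata such as sex or statin treatment, setting `r = 0` is usually a reasonable starting point; related samples would call for a non-zero value.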

Figure S2: Manhattan plot of all eight subgroups (minimum p-value per SNP).
The y-axis was limited to 20, and all SNPs with higher values were set to 20 (maximum original -log10(p) = 143.8). Color indicates the subgroup with the lowest p-value for each SNP with -log10(p) > 6. The 11 loci with sufficient support (3 or more associated SNPs) are labeled. The red dashed horizontal line indicates genome-wide significance (p < 5 × 10⁻⁸), while the blue dotted line indicates suggestive significance (p < 1 × 10⁻⁶).
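
The plotted value per SNP can be sketched as follows: take the smallest p-value across the subgroups, transform it to -log10(p), and cap it at 20. The function name and the subgroup labels are hypothetical; only the cap of 20 and the coloring threshold of 6 come from the caption.

```python
import math


def manhattan_point(pvalues_by_subgroup, cap=20.0, color_threshold=6.0):
    """Return (best subgroup, capped -log10(p), whether the point is colored)."""
    # Subgroup with the smallest p-value for this SNP
    subgroup, p = min(pvalues_by_subgroup.items(), key=lambda kv: kv[1])
    neglog = -math.log10(p)
    # Points are colored by subgroup only above the suggestive threshold
    colored = neglog > color_threshold
    return subgroup, min(neglog, cap), colored


# Made-up p-values for one SNP in three of the eight subgroups
point = manhattan_point({
    "statin-free males": 1e-30,
    "statin-free females": 1e-12,
    "statin-treated males": 1e-4,
})
```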

Figure S10: Directed acyclic graph (DAG) for Mendelian Randomization (MR).
We analyzed the causal effect of PCSK9 on LDL-C, stratified by sex and statin treatment. Statin treatment indirectly induces gene expression of LDLR and PCSK9. PCSK9 increases the degradation of LDLR and hence increases LDL-C plasma levels. Biological sex is a known risk factor for both PCSK9 and LDL-C.

Figure S11: Forest plot of the causal estimates per subgroup.
For each subgroup and SNP, we estimated the Wald ratio and used the first term of the delta method for the standard error. Then we combined the single-SNP estimates in an inverse-variance-weighted (IVW) meta-analysis (fixed effect), and tested for heterogeneity leaving one SNP out ("w/o SNP x"). Throughout all subgroups, the causal estimate of rs11583680 is weaker than the other three, introducing heterogeneity in the IVW analysis. A) Statin-free subjects B) Statin-treated subjects C) Males D) Females E) Statin-free males F) Statin-free females G) Statin-treated males H) Statin-treated females.
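
The estimation steps named in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and input values are hypothetical, and Cochran's Q is used here as an assumed heterogeneity statistic because the caption does not name the test.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Wald ratio with first-order delta-method standard error."""
    theta = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return theta, se


def ivw_fixed(estimates):
    """Fixed-effect IVW estimate, its SE, and Cochran's Q."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    theta = sum(w * t for w, (t, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    q = sum(w * (t - theta) ** 2 for w, (t, _) in zip(weights, estimates))
    return theta, se, q


def leave_one_out(estimates):
    """IVW result with each SNP removed in turn ("w/o SNP x")."""
    return [ivw_fixed(estimates[:i] + estimates[i + 1:])
            for i in range(len(estimates))]


# Made-up SNP effects: (beta on PCSK9, beta on LDL-C, SE of LDL-C effect)
snps = [(0.50, 0.25, 0.05), (0.40, 0.20, 0.04), (0.20, 0.04, 0.02)]
ratios = [wald_ratio(bx, by, sy) for bx, by, sy in snps]
theta_ivw, se_ivw, q_all = ivw_fixed(ratios)
loo = leave_one_out(ratios)
```

In this toy input the third SNP has a weaker ratio than the other two, so Q drops to about zero once it is left out, which mirrors the pattern the caption describes for rs11583680.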